Experiments with a Gaussian Merging-Splitting Algorithm for HMM Training for Speech Recognition

نویسنده

  • Ananth Sankar
چکیده

It is well known that the expectation-maximization (EM) algorithm, commonly used to estimate hidden Markov model (HMM) parameters for speech recognition, is sensitive to the initial model parameter values, making appropriate parameter initialization important. We investigate the use of iterative Gaussian splitting and EM training to initialize the desired number of Gaussians per HMM state (or state cluster). We then study merging of Gaussians which contain little training data as an approach to robust parameter estimation. Finally Gaussian merging and splitting is combined to form the Gaussian Merging-Splitting (GMS) algorithm. Detailed experimental studies show that Gaussian splitting gives similar performance to our previous training algorithm, even though the two algorithms give very different parameter values. The robust parameter estimation from Gaussian merging results in better performance than our old algorithm for speaker-independent models that have a large number of parameters relative to the amount of training data. In addition, in one experiment, speaker adaptation gave a 7% relative improvement in word error rate over the SI models when the SI models were trained with Gaussian splitting alone, as compared to a 15.5% improvement when the SI models were trained with both splitting and merging, even though the unadapted SI models gave similar performance. Finally, experiments with the GMS algorithm show that, for a given number of Gaussian parameters, better performance is achieved by reducing the number of HMM state clusters and increasing the number of Gaussians per state cluster.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Robust HMM estimation with Gaussian merging-splitting and tied-transform HMMs

We present two different approaches for robust estimation of the parameters of context-dependent hidden Markov models (HMMs) for speech recognition. The first approach, the Gaussian MergingSplitting (GMS) algorithm, uses Gaussian splitting to uniformly distribute the Gaussians in acoustic space, and merging so as to compute only those Gaussians that have enough data for robust estimation. We sh...

متن کامل

Improving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM

Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...

متن کامل

Speech enhancement based on hidden Markov model using sparse code shrinkage

This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on the independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a Maximum a posterior (MAP) estimator based on Laplace-Gaussian (for clean speech and noise respectively) combination in the HMM ...

متن کامل

Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition

Hidden Markov Model is a popular statisical method that is used in continious and discrete speech recognition. The probability density function of observation vectors in each state is estimated with discrete density or continious density modeling. The performance (in correct word recognition rate) of continious density is higher than discrete density HMM, but its computation complexity is very ...

متن کامل

Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition

Hidden Markov Model is a popular statisical method that is used in continious and discrete speech recognition. The probability density function of observation vectors in each state is estimated with discrete density or continious density modeling. The performance (in correct word recognition rate) of continious density is higher than discrete density HMM, but its computation complexity is very ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998